What We'll Cover
One of the most exciting — and most dangerous — uses of generative AI in research is brainstorming. AI can generate dozens of research ideas in minutes, suggest connections across disciplines you would never have considered, and reframe problems in unexpected ways. Recent research suggests that AI-generated ideas can even be rated as more novel than those produced by human experts.
But there is a dark side. AI brainstorming tools tend to converge on similar ideas, creating what researchers are calling an "idea monoculture." They anchor your thinking to their first suggestion. They make ideas feel novel simply because they are unexpected to you — even when the ideas are well-trodden in other fields. And they replace exactly the kinds of unstructured, serendipitous thinking that have historically produced the most transformative research breakthroughs.
This session will teach you how to use AI effectively for research ideation — the prompting strategies that actually work, the evidence on whether AI ideas are any good, the risks you need to manage, and when to put the laptop away entirely and go for a long walk instead.
📝 Prompt Engineering Fundamentals for Ideation
The quality of AI-generated ideas depends heavily on how you ask. A generic prompt ("Give me research ideas about climate change") will produce generic results. Structured prompting strategies force the model to think in specific ways, producing more diverse and useful output. Here are five strategies that work well for research ideation, each with a different cognitive purpose.
⛓️ Chain-of-Thought Prompting
Purpose: Forces structured, step-by-step reasoning instead of jumping to conclusions. The model must show its working, which exposes gaps and assumptions.
This strategy works because it prevents the model from pattern-matching to the most obvious answer. By requiring reasoning at each step, you get more considered output and can see where the model's logic is weak.
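As a concrete sketch, a chain-of-thought ideation prompt can be assembled like this. The topic and step wording are illustrative placeholders, not a prescribed template:

```python
# Sketch of a chain-of-thought ideation prompt. The topic and the step
# wording are illustrative placeholders; adapt them to your own area.
topic = "sleep quality and university students' academic performance"

cot_prompt = f"""I am exploring research ideas on: {topic}.
Work through this step by step, showing your reasoning at each stage:
Step 1: List the major open questions in this area.
Step 2: For each question, state the key assumption it rests on.
Step 3: Identify which of those assumptions are weakest or least tested.
Step 4: Only now, propose three research ideas targeting the weak assumptions.
Do not skip ahead to Step 4; the earlier steps must appear in your answer."""

print(cot_prompt)
```

The value is in the ordering: the model must surface assumptions before proposing ideas, so you can see which suggestions rest on shaky ground.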
🎭 Role-Playing Prompting
Purpose: Shifts the model's perspective to generate ideas from a specific expert viewpoint, accessing different knowledge patterns in the training data.
Role-playing works well because the model draws on different patterns depending on the persona. An "epidemiologist" will frame problems differently from a "health economist" even when discussing the same topic. Try multiple roles on the same question.
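A lightweight way to try multiple roles on the same question is to template the persona. The roles and question below are examples, not a fixed list:

```python
def role_prompt(role: str, question: str) -> str:
    """Frame the same research question from a named expert's viewpoint."""
    return (
        f"You are an experienced {role}. Answer strictly from that "
        f"disciplinary perspective.\n"
        f"Question: {question}\n"
        f"Give the three research angles your field would consider most "
        f"promising, and the one angle your field systematically neglects."
    )

question = "What drives long-term adherence to exercise programmes?"
roles = ["epidemiologist", "health economist", "behavioural scientist"]
prompts = [role_prompt(r, question) for r in roles]
```

Send each prompt in a separate conversation and compare the framings; the differences between personas are often more informative than any single answer.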
🚫 Constraint-Based Prompting
Purpose: Forces creative divergence by removing the most obvious paths. When you prohibit the obvious, the model must find alternative routes.
This is one of the most powerful ideation strategies. Most researchers are anchored to the dominant methods and populations in their field. By explicitly blocking them, you force both yourself and the AI to explore the margins where novel contributions are most likely.
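A minimal sketch of constraint injection follows. The constraints listed are examples; the real work is naming your own field's dominant defaults so they can be blocked explicitly:

```python
question = "What explains regional differences in vaccine uptake?"

# Block the dominant defaults so the model must search the margins.
# These example constraints assume a field where surveys and student
# samples are the default; substitute your own field's defaults.
constraints = [
    "Do not propose surveys, interviews, or focus groups.",
    "Do not use social media data.",
    "Do not study university-student samples.",
]

constraint_prompt = (
    f"Generate five study designs for: {question}\n"
    "Hard constraints (discard any idea that violates one):\n"
    + "\n".join(f"- {c}" for c in constraints)
    + "\nFor each surviving idea, state which constraint it came closest "
    "to breaking."
)

print(constraint_prompt)
```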
👁 Adversarial Prompting
Purpose: Finds blind spots and weaknesses by asking the model to critique rather than create. The critical perspective often reveals more than the constructive one.
Adversarial prompting is underused. Researchers tend to ask AI "What should I study?" when the better question is "What is everyone getting wrong?" The critique-first approach generates ideas that challenge existing paradigms rather than extending them incrementally.
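A critique-first prompt might look like the sketch below. The draft idea is a stand-in for your own; the reviewer framing is what forces a critical rather than agreeable stance:

```python
# Sketch of an adversarial (critique-first) prompt. `draft_idea` is a
# placeholder standing in for your own idea.
draft_idea = (
    "Use wearable-device data to predict burnout in junior doctors "
    "from changes in sleep and activity patterns."
)

adversarial_prompt = f"""Act as the most sceptical reviewer on a grant panel.
Idea under review: {draft_idea}
Do not offer encouragement or extensions. Instead give:
1. The three strongest reasons to reject this idea outright.
2. The single assumption most likely to be wrong.
3. Prior work that may already have tried something similar and failed.
4. What everyone working in this area is collectively getting wrong."""

print(adversarial_prompt)
```

Note the explicit instruction not to offer encouragement: without it, the model's trained helpfulness tends to reassert itself.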
🌐 Cross-Disciplinary Prompting
Purpose: Generates unexpected connections by applying frameworks from one discipline to another. Many breakthrough papers come from exactly this kind of cross-pollination.
This strategy leverages one of AI's genuine strengths: it has been trained on text from many disciplines and can identify structural analogies that a specialist might miss. The key is to ask for specifics, not vague gestures toward "interdisciplinarity."
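A transfer prompt can enforce that specificity directly. The field pairing below is an example; swap in your own source and target disciplines:

```python
# Sketch of a cross-disciplinary transfer prompt. The field pairing is
# an illustrative example.
source_field = "epidemiology"
target_field = "the spread of online misinformation"

transfer_prompt = f"""Identify three specific, named frameworks or models from
{source_field} that could be applied to {target_field}. For each one:
(a) name the framework and its standard use in {source_field};
(b) state the structural analogy that justifies the transfer;
(c) propose one concrete study in {target_field} using it, including the
    data you would need.
No vague appeals to 'interdisciplinarity': every suggestion must be specific."""

print(transfer_prompt)
```

Parts (a) through (c) are the safeguard: requiring a named framework and a stated analogy makes hand-waving harder to pass off as an idea.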
📊 The Evidence — Can AI Actually Generate Good Ideas?
The question of whether AI can generate genuinely useful research ideas is no longer purely speculative. A growing body of research has attempted to measure this directly, with results that are both encouraging and cautionary.
📚 Key Study: Si et al. (2024) — "Can LLMs Generate Novel Research Ideas?"
This is the most rigorous study to date on AI ideation quality. Over 100 NLP researchers participated in a large-scale evaluation. Human experts and an LLM (using Claude 3.5 Sonnet as the backbone) each generated research ideas, which were then blind-reviewed by other experts on dimensions including novelty, feasibility, and excitement.
The headline finding: LLM-generated ideas were rated as significantly more novel than those produced by human experts. This surprised many in the research community — the AI was better at generating surprising, unexpected combinations of existing concepts.
But the caveats are critical: While novelty scores were higher, AI ideas were rated as weaker on feasibility. The AI generated ideas that sounded exciting but were harder to actually execute. Perhaps most importantly, the AI-generated ideas were significantly less diverse than human ideas — they clustered around similar themes and approaches, even when prompted to be diverse.
What this means for you: AI is good at producing surprising combinations of concepts — the kind of "What if we applied X to Y?" thinking that can spark genuine insight. But it is poor at judging whether those combinations will actually work in practice. The diversity problem is perhaps the most concerning finding: if you rely on AI for brainstorming, you may be exploring a narrower space than you think.
📈 What AI Does Well
- Generating a high volume of ideas quickly — useful in early-stage brainstorming
- Finding unexpected combinations across subfields within a discipline
- Reframing problems using different theoretical lenses
- Identifying structural analogies between different research areas
- Producing ideas that score well on "surprise" — things you would not have thought of yourself
📉 What AI Does Poorly
- Assessing feasibility — whether an idea can actually be executed with available resources
- Generating genuinely diverse ideas — outputs tend to cluster around similar themes
- Understanding the practical constraints of a specific research context
- Distinguishing between "novel to you" and "novel to the field"
- Knowing when an idea has already been explored and failed
⚙️ Girotra et al. (2023): Quantity vs. Quality
An earlier study by Girotra, Meincke, Terwiesch, and Ulrich (published via SSRN) examined AI idea generation in an innovation context. They found that AI systems generate substantially more ideas than human participants in the same timeframe. However, the quality assessment was mixed: while the best AI ideas were competitive with the best human ideas, the average quality was lower, and AI produced more ideas that were rated as poor or unworkable.
An important caveat: this study used GPT-4 era models (2023). AI capabilities have advanced significantly since then, and a similar study conducted today might well find that average AI idea quality also exceeds average human quality. The diversity problem, however, appears to be more structural — it stems from how LLMs generate text through pattern completion, and improvements in raw capability do not necessarily address it.
The practical implication is that AI brainstorming requires heavy curation. You can use AI to generate a large pool of candidate ideas, but the work of evaluating, filtering, and developing those ideas remains fundamentally human. Think of AI as expanding the search space, not as replacing your judgement about what is worth pursuing.
🌱 The Idea Monoculture Problem
If every researcher uses the same AI tools to generate ideas, and those tools tend to converge on similar outputs, the result is a dangerous homogenisation of the research landscape. This is not a hypothetical concern — it is already happening.
📰 Scientific Monoculture
A 2026 paper in Nature Communications Psychology argues that AI is driving research toward topical and methodological convergence. The rush to study AI itself creates a feedback loop: researchers adopt AI methods, which leads them toward AI-compatible questions, which concentrates the field around a narrower range of topics and approaches. The paper warns that this dynamic risks undermining the diversity of inquiry that science depends on.
📊 The Homogenising Effect
A 2025 study, published in a journal hosted on ScienceDirect, examined the homogenising effect of LLMs on creative diversity. The findings are striking: diversity gaps widen as more content is AI-generated. Rather than expanding the range of ideas in circulation, widespread AI use actually narrows it. The more people use AI to write and think, the more similar the outputs become across the board.
- The institutional pressure is real. As AI becomes embedded in research workflows, institutions increasingly reward speed and scale — the things AI is good at. This creates pressure to adopt AI-compatible methods and questions, sidelining research approaches that do not lend themselves to AI augmentation. Ethnographic fieldwork, long-term observational studies, and interpretive methods may be deprioritised not because they are less valuable, but because they are harder to accelerate with AI.
- The training data problem compounds the issue. All major LLMs are trained on broadly similar corpora of internet text and academic publications. They have absorbed the same biases, the same dominant framings, and the same blind spots. When ten researchers in different countries use different AI tools to brainstorm on the same topic, they will get more similar results than if those ten researchers had brainstormed independently. The tools are different, but the underlying data — and therefore the underlying patterns — are shared.
- Diversity of thought requires diversity of process. The most innovative research often comes from researchers who think differently because they have had different experiences, read different literatures, or work in different intellectual traditions. AI flattens these differences by channelling everyone through the same statistical patterns.
⚠️ The Risks of AI-Assisted Ideation
Beyond the systemic problem of monoculture, there are specific cognitive risks that affect individual researchers who use AI for brainstorming. These are not reasons to avoid AI entirely — but they are risks you must actively manage.
⚓ Anchoring
The risk: Once AI frames a problem or suggests an approach, it becomes the reference point for all subsequent thinking. You evaluate every other idea relative to the AI's first suggestion, rather than exploring the space freely.
How it manifests: You ask AI for research ideas, it suggests studying "the impact of social media on adolescent mental health using sentiment analysis," and now every idea you generate is a variation on that frame — different platforms, different age groups, different methods, but always the same basic shape. The AI has not helped you think; it has told you what to think about.
Mitigation: Always brainstorm on your own first, even briefly, before consulting AI. Write down your initial thoughts so you have an anchor-free baseline to return to.
🔍 Confirmation Bias
The risk: AI tends to agree with you. If you present a research direction and ask the AI whether it is promising, it will almost always find reasons to say yes. LLMs are trained to be helpful, which often means being agreeable rather than honest.
How it manifests: You describe your planned study and ask "Is this a good research question?" The AI responds with enthusiasm, lists reasons your approach is sound, and suggests extensions. It does not tell you that three other groups have already done this work, or that the method you are proposing has known limitations in this context.
Mitigation: Use adversarial prompting deliberately. Ask "What is wrong with this idea?" or "Why would a reviewer reject this?" Force the model into a critical stance rather than a supportive one.
🔨 Premature Convergence
The risk: AI narrows quickly to "the answer" rather than keeping the ideation space open. It is optimised to be decisive and conclusive, whereas good brainstorming requires sustained ambiguity and exploration.
How it manifests: You ask for "ten different approaches to studying X" and the AI gives you ten variations on essentially the same approach. Or you ask an open-ended question and the AI converges immediately on the most common framing, closing down the exploration before it has properly begun.
Mitigation: Explicitly ask for maximally different ideas. Use constraint-based prompting to block the first wave of obvious suggestions. Run multiple separate conversations rather than one long thread, so each starts fresh without the context of previous suggestions.
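If you script your brainstorming against a chat API, the "separate conversations" advice translates into independent single-turn calls with no shared history. A minimal sketch follows; the `ask` callable is a placeholder for whichever client you use, and the stub below only demonstrates the shape:

```python
def independent_ideas(ask, prompt: str, runs: int = 3) -> list[str]:
    """Collect ideas from `runs` fresh single-turn calls.

    `ask` must start a NEW conversation on each call, with no prior
    messages, so later calls cannot anchor on earlier suggestions.
    """
    return [ask(prompt) for _ in range(runs)]

# Stub in place of a real API client, purely to demonstrate the shape:
stub = lambda p: f"[model reply to: {p}]"
ideas = independent_ideas(stub, "Propose one maximally unusual way to study X.")
print(len(ideas))
```

The design point is that diversity comes from the structure of the calls, not from asking one long thread to "be diverse": each call sees an empty context.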
🎨 Novelty Illusion
The risk: Ideas that feel novel because they are unexpected to you may be well-trodden in another field. AI can create a false sense of discovery by presenting known ideas in unfamiliar packaging.
How it manifests: The AI suggests applying network analysis to your research question, and it feels like a breakthrough. But in the field next door, network analysis has been the standard approach for two decades. The idea is novel to you but not to the literature. This is especially dangerous for early-career researchers who have not yet developed a broad view of adjacent fields.
Mitigation: Before committing to any AI-suggested idea, do a thorough literature search specifically to check whether it has already been done. Ask the AI directly: "Has this approach already been used in [adjacent field]?"
🎲 Loss of Serendipity
The risk: Some of the best research ideas in history came from unplanned encounters — the random conversation at a conference, the unexpected footnote in a paper you were reading for a different reason, the misheard comment that sparked a new line of thinking. AI cannot replicate the unplanned.
How it manifests: When AI becomes the default brainstorming tool, you spend less time in the unstructured states where serendipity occurs. You stop browsing library shelves. You stop attending talks outside your field. You stop having aimless conversations with colleagues. Each of these activities has a lower hit rate than AI brainstorming, but the hits are often more transformative.
Mitigation: Deliberately preserve unstructured time in your research practice. Attend seminars outside your discipline. Read journals you would not normally read. Have lunch with people from different departments. These are not inefficiencies to be optimised away; they are features of a creative research life.
🚶 When NOT to Use AI for Ideation
We have spent most of this session discussing how to use AI well for ideation. But there are times when the best thing you can do for your research is to not use AI at all.
- When you are just starting to explore a new area. If you do not yet have a deep understanding of a field, AI ideas will overwhelm rather than inform. You need to read enough to develop your own intuitions before AI suggestions become useful rather than confusing. Use the literature review tools from Week 5 first. Build understanding before you brainstorm.
- When you need genuinely original framing. If your research contribution depends on seeing a problem in a way nobody else has, AI is unlikely to help. AI generates ideas by recombining patterns from existing text. It can produce novel combinations, but it cannot produce a genuinely new way of seeing. That kind of originality comes from deep engagement with a problem over time, from lived experience, and from intellectual traditions that the model may not adequately represent.
- When you are stuck in a rut. Counterintuitively, the moment you most want to use AI — when you feel stuck and frustrated — may be the worst time to do so. Being stuck is often a sign that you are about to have a breakthrough, if you sit with the discomfort long enough. Reaching for AI at this point short-circuits the process. Instead, go for a walk. Talk to a colleague. Read something unrelated. Sleep on it.
- When your research draws on local or Indigenous knowledge. AI models are trained predominantly on English-language, Western academic text. If your research engages with local knowledge systems, community perspectives, or non-Western intellectual traditions, AI brainstorming will pull you toward the dominant framings and away from the perspectives that make your work distinctive and valuable.
- When the conversation matters more than the output. Some of the best research ideas emerge from conversations between people — the back-and-forth of challenge and response, the shared excitement of "what if," the social energy of collaborative thinking. AI cannot replicate the generative quality of a good conversation with a colleague who knows your work and pushes your thinking in unexpected directions.
📚 Readings
Core Readings
📄 Si, C., et al. (2024). "Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers."
The most rigorous study to date on AI-generated research ideas. Over 100 NLP researchers participated in blind evaluation. Key finding: LLM-generated ideas were rated more novel but less feasible and less diverse than human expert ideas. Essential reading for understanding both the promise and limits of AI ideation.
📄 Mollick, E. (2023). "How to Use AI to Do Practical Stuff: A New Guide."
Ethan Mollick's widely read practical guide to using AI effectively, including for brainstorming and ideation. Focuses on prompting strategies that work in practice, with honest assessment of limitations. Regularly updated with new insights.
📄 Nature Communications Psychology (2026). "AI is turning research into a scientific monoculture."
An important recent paper arguing that AI adoption in research is driving convergence in both topics and methods. Directly relevant to the idea monoculture problem discussed in this session. Provides evidence for the homogenisation thesis at the institutional level.
Supplementary Readings
White, J., et al. (2023). "A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT."
A systematic catalogue of prompt patterns, organised by purpose. Useful reference for the prompting strategies covered in this session. Provides a framework for thinking about prompts as reusable patterns rather than one-off queries.
Meincke, L., Mollick, E.R., & Terwiesch, C. (2024). "Prompting Diverse Ideas: Increasing AI Idea Variance."
Directly addresses the convergence problem: how to get more diverse outputs from AI brainstorming. Offers practical prompting strategies for mitigating the idea monoculture issue identified in the Si et al. study.
Girotra, K., Meincke, L., Terwiesch, C., & Ulrich, K.T. (2023). "Ideas Are Dimes a Dozen: Large Language Models for Idea Generation in Innovation."
An early study examining AI idea generation in an innovation context. Finds that AI generates more ideas but quality assessment is mixed, with implications for how we use AI brainstorming in research.
Sourati, Z. & Evans, J.A. (2023). "Accelerating science with human-aware artificial intelligence." Nature Human Behaviour.
Explores how AI can complement rather than replace human scientific reasoning. Argues for "human-aware" AI systems that account for human cognitive strengths and weaknesses, rather than simply maximising AI output.
Sakana AI (2026). "Towards end-to-end automation of AI research." Nature.
The AI Scientist: a system that generates research ideas, writes code, runs experiments, drafts papers in LaTeX, and performs its own peer review. One AI-generated paper passed peer review at the ICLR 2025 workshop, scoring higher than 55% of human papers — though none passed the main conference bar. A landmark demonstration of end-to-end automated research, raising profound questions about what "doing research" means.
ScienceDirect (2025). "Homogenizing effect of LLMs on creative diversity."
Empirical study demonstrating that diversity gaps widen with increased AI-generated content. Important evidence for the idea monoculture problem discussed in this session.
Key Takeaways
- Prompting strategy matters enormously. Generic prompts produce generic ideas. Chain-of-thought, role-playing, constraint-based, adversarial, and cross-disciplinary prompting each serve different purposes and produce different kinds of output. Use them deliberately and in combination.
- AI-generated ideas can be surprisingly novel, but feasibility is a weakness. The Si et al. (2024) study showed that LLM ideas score higher on novelty than human expert ideas, but lower on feasibility and diversity. AI is better at "What if?" than "Will this actually work?"
- Idea monoculture is a real and growing problem. When everyone uses the same AI tools, everyone converges on the same ideas. Research on the homogenising effect of LLMs suggests that widespread AI adoption is narrowing rather than broadening the diversity of ideas in circulation.
- The cognitive risks are specific and manageable. Anchoring, confirmation bias, premature convergence, novelty illusion, and loss of serendipity are all real dangers. But each can be mitigated with deliberate practices: brainstorming alone first, using adversarial prompts, running multiple separate conversations, verifying novelty against the literature, and preserving unstructured time.
- Sometimes the best tool is no tool at all. Long walks, boredom, conversations with colleagues, and reading outside your field have produced more transformative research ideas than any brainstorming method. Do not optimise away the unstructured thinking time that makes genuine creativity possible.